global minimum
Convergence Analysis of Two-layer Neural Networks with ReLU Activation
In recent years, stochastic gradient descent (SGD) based techniques has become the standard tools for training neural networks. However, formal theoretical understanding of why SGD can train neural networks in practice is largely missing. In this paper, we make progress on understanding this mystery by providing a convergence analysis for SGD on a rich subset of two-layer feedforward networks with ReLU activations. This subset is characterized by a special structure called identity mapping. We prove that, if input follows from Gaussian distribution, with standard $O(1/\sqrt{d})$ initialization of the weights, SGD converges to the global minimum in polynomial number of steps.
Deep Learning without Poor Local Minima
In this paper, we prove a conjecture published in 1989 and also partially address an open problem announced at the Conference on Learning Theory (COLT) 2015. For an expected loss function of a deep nonlinear neural network, we prove the following statements under the independence assumption adopted from recent work: 1) the function is non-convex and non-concave, 2) every local minimum is a global minimum, 3) every critical point that is not a global minimum is a saddle point, and 4) the property of saddle points differs for shallow networks (with three layers) and deeper networks (with more than three layers). Moreover, we prove that the same four statements hold for deep linear neural networks with any depth, any widths and no unrealistic assumptions. As a result, we present an instance, for which we can answer to the following question: how difficult to directly train a deep model in theory? It is more difficult than the classical machine learning models (because of the non-convexity), but not too difficult (because of the nonexistence of poor local minima and the property of the saddle points). We note that even though we have advanced the theoretical foundations of deep learning, there is still a gap between theory and practice.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.87)
A Missing statements and proofs 521 A.1 Statements for Section 3.1
Let a two-player Markov game where both players affect the transition. As we have seen in Section 2.1, in the case of unilateral deviation from joint policy Let a (possibly correlated) joint policy ˆ σ . By Lemma A.1, we know that Where the equality holds due to the zero-sum property, (1). An approximate NE is an approximate global minimum. An approximate global minimum is an approximate NE.
- North America > United States > California > Orange County > Irvine (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- (2 more...)
- Information Technology > Game Theory (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > California (0.14)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > Canada (0.04)
- (2 more...)
- North America > United States > Illinois (0.05)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > China > Beijing > Beijing (0.05)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > Canada > Quebec > Montreal (0.04)